
[Infer] Add head_dim=96 dispatch for block attention #70763

Merged (1 commit, Jan 14, 2025)

Conversation

@zeroRains (Contributor) commented Jan 10, 2025

PR Category

Inference

PR Types

Bug fixes

Description

Add a head_dim=96 dispatch for block attention.

How to reproduce: run the following commands in PaddleNLP:

cd llm
# block attention
python ./predict/predictor.py --model_name_or_path bigscience/bloom-1b1 --dtype float16 --mode dynamic --decode_strategy greedy_search --inference_model 1 --block_attn 1

# normal attention
python ./predict/predictor.py --model_name_or_path bigscience/bloom-1b1 --dtype float16 --mode dynamic --decode_strategy greedy_search --inference_model 1 --block_attn 0

# The block attention command produces the same results as the normal attention command
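The fix amounts to adding one more case to a compile-time head-dim dispatch, so that models whose attention head dimension is 96 (such as bloom-1b1) can route into a specialized block-attention kernel. The following is a minimal C++ sketch of that dispatch pattern; the names `DispatchBlockAttention` and `LaunchKernel` are illustrative assumptions, not Paddle's actual identifiers.

```cpp
#include <cstdio>
#include <cstring>
#include <stdexcept>

// Hypothetical stand-in for launching a kernel specialized on HeadDim.
// In the real code this would launch a CUDA kernel template instance;
// here it just returns a tag describing which instance was chosen.
template <int HeadDim>
const char* LaunchKernel() {
  static char buf[32];
  std::snprintf(buf, sizeof(buf), "kernel<head_dim=%d>", HeadDim);
  return buf;
}

// Runtime head_dim is mapped to a compile-time specialization.
// Before this PR, a head_dim of 96 would fall through to the error path.
const char* DispatchBlockAttention(int head_dim) {
  switch (head_dim) {
    case 64:  return LaunchKernel<64>();
    case 96:  return LaunchKernel<96>();   // the branch this PR adds
    case 128: return LaunchKernel<128>();
    default:
      throw std::runtime_error("block attention: unsupported head_dim");
  }
}
```

With this pattern, supporting a new head dimension is a one-line change plus the corresponding template instantiation, which matches the small size of the PR (one commit).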


paddle-bot bot commented Jan 10, 2025

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@paddle-bot paddle-bot bot added the contributor External developers label Jan 10, 2025
@zeroRains zeroRains requested a review from SigureMo as a code owner January 11, 2025 09:05
@yuanlehome yuanlehome merged commit 825e6ec into PaddlePaddle:develop Jan 14, 2025
33 checks passed
Labels
contributor External developers
2 participants